Event log rotation #1997

Open: wants to merge 40 commits into base: master

@ffakenz ffakenz commented May 8, 2025

  • Event log rotation enables Hydra heads to recover faster after a failure.

  • Rotating an event log means switching where the EventSink writes events to and where the EventSource reads events from.

  • File-based persistence now rotates event log files using a monotonically increasing index named logId.

  • Added a new run option to enable rotation after a given number of events:
    "The number of Hydra events to trigger rotation (default: no rotation)"

  • The base persistence event log file has been renamed from /state to /state-logId (starting from 0) to support switching between rotation modes (on/off).

  • The latest /state-logId file is used on startup (the one with the highest logId).

  • On startup, depending on the rotation config used, the latest event log may be rotated, in which case the logId index is incremented.

  • Note: the default Server API event sink does not support rotation.

  • Added a new server output that lets third-party agents detect the checkpoint and trigger any archival, backup, or cleanup needed, without interrupting the Hydra head.
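As a rough illustration of the startup rule described above (use the state file with the highest logId), here is a minimal sketch. The names parseLogId and latestLogId are hypothetical and the state-<logId> naming is taken from the description; the file discovery in the PR itself may work differently.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)
import Numeric.Natural (Natural)
import Text.Read (readMaybe)

-- | Extract the logId from a file name such as "state-3".
-- Hypothetical helper: names without the "state-" prefix or without
-- a numeric suffix are ignored.
parseLogId :: FilePath -> Maybe Natural
parseLogId name = stripPrefix "state-" name >>= readMaybe

-- | Pick the highest logId among the given file names, if any.
latestLogId :: [FilePath] -> Maybe Natural
latestLogId names =
  case mapMaybe parseLogId names of
    [] -> Nothing
    ids -> Just (maximum ids)
```

With rotation disabled, only state-0 would exist and the sketch returns Just 0, which matches the renaming of the base file described above.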


  • CHANGELOG updated or not needed
  • Documentation updated or not needed
  • Haddocks updated or not needed
  • No new TODOs introduced or explained hereafter

@github-project-automation github-project-automation bot moved this to Triage 🏥 in ☕ Hydra Team Work May 8, 2025
@ffakenz ffakenz moved this from Triage 🏥 to In progress 🕐 in ☕ Hydra Team Work May 8, 2025
@ffakenz ffakenz linked an issue May 8, 2025 that may be closed by this pull request
@ffakenz ffakenz added the 💬 feature A feature on our roadmap label May 8, 2025
@ffakenz ffakenz self-assigned this May 8, 2025
@ffakenz ffakenz requested a review from a team May 8, 2025 10:46
@ffakenz ffakenz added this to the 0.22.0 milestone May 8, 2025

github-actions bot commented May 8, 2025

Transaction costs

Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters are currently using arbitrary values and results are not fully deterministic and comparable to previous runs.

Metadata
Generated at 2025-05-29 14:14:19.479758186 UTC
Max. memory units 14000000
Max. CPU units 10000000000
Max. tx size (kB) 16384

Script summary

| Name | Hash | Size (Bytes) |
| --- | --- | --- |
| νInitial | c8a101a5c8ac4816b0dceb59ce31fc2258e387de828f02961d2f2045 | 2652 |
| νCommit | 61458bc2f297fff3cc5df6ac7ab57cefd87763b0b7bd722146a1035c | 685 |
| νHead | be6ebc744208c660bf0fdc1cfbb5157477cd305de5b1777e575cbb4c | 14665 |
| μHead | 1f47a42d1d6edc32ccd834acb19d5db3b2a5232f0bd7eaa8908dc519* | 5284 |
| νDeposit | ae01dade3a9c346d5c93ae3ce339412b90a0b8f83f94ec6baa24e30c | 1102 |

* The minting policy hash is only usable for comparison. As the script is parameterized, the actual script is unique per head.

Init transaction costs

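| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 5834 | 10.35 | 3.28 | 0.51 |
| 2 | 6038 | 12.70 | 4.03 | 0.55 |
| 3 | 6238 | 14.76 | 4.67 | 0.58 |
| 5 | 6638 | 18.80 | 5.94 | 0.64 |
| 10 | 7647 | 29.06 | 9.16 | 0.79 |
| 43 | 14282 | 99.11 | 30.98 | 1.80 |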

Commit transaction costs

This uses ada-only outputs for better comparability.

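| UTxO | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 561 | 2.44 | 1.16 | 0.20 |
| 2 | 737 | 3.38 | 1.73 | 0.22 |
| 3 | 923 | 4.36 | 2.33 | 0.24 |
| 5 | 1279 | 6.41 | 3.60 | 0.28 |
| 10 | 2170 | 12.13 | 7.25 | 0.40 |
| 54 | 10060 | 98.61 | 68.52 | 1.88 |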

CollectCom transaction costs

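| Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- |
| 1 | 56 | 524 | 25.24 | 7.32 | 0.43 |
| 2 | 114 | 640 | 32.28 | 9.39 | 0.51 |
| 3 | 170 | 747 | 40.11 | 11.64 | 0.59 |
| 4 | 224 | 857 | 53.97 | 15.37 | 0.73 |
| 5 | 281 | 969 | 58.10 | 16.81 | 0.78 |
| 6 | 340 | 1081 | 66.14 | 19.11 | 0.87 |
| 7 | 396 | 1192 | 82.66 | 23.46 | 1.04 |
| 8 | 449 | 1303 | 93.28 | 26.35 | 1.15 |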

Cost of Increment Transaction

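| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 1805 | 24.24 | 8.05 | 0.48 |
| 2 | 1924 | 25.27 | 9.07 | 0.50 |
| 3 | 2059 | 27.14 | 10.30 | 0.53 |
| 5 | 2280 | 29.12 | 12.20 | 0.57 |
| 10 | 3093 | 38.79 | 18.79 | 0.74 |
| 41 | 7722 | 97.13 | 58.57 | 1.70 |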

Cost of Decrement Transaction

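| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 597 | 22.77 | 7.34 | 0.41 |
| 2 | 734 | 23.54 | 8.20 | 0.43 |
| 3 | 819 | 24.02 | 9.00 | 0.45 |
| 5 | 1294 | 32.27 | 12.65 | 0.56 |
| 10 | 1873 | 37.55 | 17.45 | 0.66 |
| 39 | 6550 | 99.05 | 53.92 | 1.63 |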

Close transaction costs

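| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 662 | 29.03 | 9.25 | 0.48 |
| 2 | 819 | 31.39 | 10.74 | 0.52 |
| 3 | 1018 | 31.36 | 11.49 | 0.53 |
| 5 | 1278 | 34.75 | 13.88 | 0.59 |
| 10 | 1842 | 44.87 | 20.16 | 0.74 |
| 34 | 5661 | 91.80 | 51.98 | 1.52 |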

Contest transaction costs

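| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 700 | 33.73 | 10.52 | 0.53 |
| 2 | 821 | 35.77 | 11.79 | 0.56 |
| 3 | 1071 | 39.01 | 13.66 | 0.61 |
| 5 | 1224 | 41.75 | 15.62 | 0.65 |
| 10 | 2064 | 53.85 | 22.95 | 0.84 |
| 29 | 5096 | 99.37 | 50.42 | 1.54 |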

Abort transaction costs

There is some variation due to the random mixture of initial and already committed outputs.

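| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 5784 | 27.01 | 9.04 | 0.69 |
| 2 | 5961 | 36.92 | 12.40 | 0.80 |
| 3 | 5973 | 41.21 | 13.74 | 0.85 |
| 4 | 6043 | 48.88 | 16.29 | 0.93 |
| 5 | 6373 | 65.36 | 21.94 | 1.12 |
| 6 | 6284 | 64.74 | 21.58 | 1.11 |
| 7 | 6674 | 81.22 | 27.32 | 1.30 |
| 8 | 6709 | 91.20 | 30.58 | 1.40 |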

FanOut transaction costs

Involves spending head output and burning head tokens. Uses ada-only UTXO for better comparability.

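| Parties | UTxO | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- | --- |
| 10 | 0 | 0 | 5834 | 19.64 | 6.56 | 0.61 |
| 10 | 1 | 57 | 5869 | 20.34 | 6.91 | 0.62 |
| 10 | 5 | 284 | 6003 | 30.68 | 10.89 | 0.74 |
| 10 | 10 | 570 | 6174 | 38.64 | 14.15 | 0.84 |
| 10 | 30 | 1709 | 6856 | 80.27 | 30.53 | 1.32 |
| 10 | 39 | 2217 | 7156 | 98.99 | 37.90 | 1.54 |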

End-to-end benchmark results

This page is intended to collect the latest end-to-end benchmark results produced by Hydra's continuous integration (CI) system from the latest master code.

Please note that these results are approximate as they are currently produced from limited cloud VMs and not controlled hardware. Rather than focusing on the absolute results, the emphasis should be on relative results, such as how the timings for a scenario evolve as the code changes.

Generated at 2025-05-29 14:17:36.741077813 UTC

Baseline Scenario

Number of nodes 1
Number of txs 300
Avg. Confirmation Time (ms) 4.525232713
P99 8.382652259999974ms
P95 5.530842250000001ms
P50 4.3063345ms
Number of Invalid txs 0

Memory data

Time Used Free
2025-05-29 14:16:20.299995449 UTC 0m 0;
2025-05-29 14:16:25.299819772 UTC 6 0;
2025-05-29 14:16:30.299781243 UTC 6 0;
2025-05-29 14:16:35.299869252 UTC 1 0;
2025-05-29 14:16:40.29986787 UTC 1 0;
2025-05-29 14:16:45.299814519 UTC 1 0;

Three local nodes

Number of nodes 3
Number of txs 900
Avg. Confirmation Time (ms) 29.001177215
P99 41.622463239999995ms
P95 38.33922515ms
P50 27.993236500000002ms
Number of Invalid txs 0

Memory data

Time Used Free
2025-05-29 14:16:58.605586778 UTC 0m 0;
2025-05-29 14:17:03.60592254 UTC 6 0;
2025-05-29 14:17:08.607284875 UTC 4 0;
2025-05-29 14:17:13.605831035 UTC 19 0;
2025-05-29 14:17:18.605826811 UTC 20 0;
2025-05-29 14:17:23.605938746 UTC 2 0;
2025-05-29 14:17:28.605746494 UTC 2 0;
2025-05-29 14:17:33.605842919 UTC 2 0;


github-actions bot commented May 8, 2025

Transaction cost differences

No cost or size differences found

Comment on lines 33 to 38
currentEvents <- getEvents eventSource
let currentNumberOfEvents = toInteger $ length currentEvents
numberOfEventsV <- newTVarIO currentNumberOfEvents
-- XXX: check rotation on startup
when (currentNumberOfEvents >= toInteger rotateAfterX) $ do
  rotateEventLog logIdV numberOfEventsV
Member

I don't think we need this as the application will call sourceEvents at least once (see rules above).

So indeed we should hook into the sourceEvents conduit to do our event counting (and maybe do rotation while sourcing!?)

Contributor Author

Please note that this decision prevents moving the "rotation on startup" logic outside this function (e.g., into hydrate).
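The counting-and-rotating behaviour discussed in this thread can be sketched roughly as follows. This is a simplified illustration, not the code under review: the Sink type, mkRotatingSink, and demo are hypothetical names, and an IORef stands in for the TVar used in the snippet above.

```haskell
import Control.Monad (when)
import Data.IORef (atomicModifyIORef', modifyIORef', newIORef, readIORef, writeIORef)

-- Simplified stand-in for an event sink (hypothetical, not Hydra's type).
newtype Sink e = Sink {putEvent :: e -> IO ()}

-- | Wrap a sink so that the 'rotate' action runs once 'rotateAfter'
-- events have been counted, then reset the counter.
-- 'initialCount' is the number of events already in the log at startup.
mkRotatingSink :: Int -> Int -> IO () -> Sink e -> IO (Sink e)
mkRotatingSink rotateAfter initialCount rotate inner = do
  counter <- newIORef initialCount
  -- Rotation check on startup, mirroring the snippet under review.
  when (initialCount >= rotateAfter) $ do
    rotate
    writeIORef counter 0
  pure $
    Sink
      { putEvent = \e -> do
          putEvent inner e
          n <- atomicModifyIORef' counter (\c -> (c + 1, c + 1))
          when (n >= rotateAfter) $ do
            rotate
            writeIORef counter 0
      }

-- | Push 7 events through a sink that rotates after every 3 events and
-- report (number of rotations, number of events stored).
demo :: IO (Int, Int)
demo = do
  rotations <- newIORef (0 :: Int)
  stored <- newIORef (0 :: Int)
  sink <-
    mkRotatingSink 3 0 (modifyIORef' rotations (+ 1)) $
      Sink (\_ -> modifyIORef' stored (+ 1))
  mapM_ (putEvent sink) "abcdefg"
  (,) <$> readIORef rotations <*> readIORef stored
```

Passing the initial count in explicitly is one way to keep the startup check inside the constructor; counting while sourcing events, as suggested above, would replace that parameter.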

mkRotatedEventStore rotationConfig checkpointer logId eventStore

mkChechpointer :: IsChainState tx => ChainStateType tx -> UTCTime -> Checkpointer (StateEvent tx)
mkChechpointer initialChainState time events =
Member

I think mkCheckpointer should live in a Hydra.Node module, since it uses the specific StateEvent type (Hydra.Events.Rotation is otherwise abstracted over e).

Contributor Author

Hydra.Node already depends on Hydra.Events.Rotation (EventStore) because hydrate takes it as an argument.

This creates a cyclic module dependency.
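To make the abstraction under discussion concrete, here is a minimal sketch of a checkpointer that is generic over the event type e. The shape `[e] -> e` is an assumption inferred from the mkChechpointer signature quoted above, and rotateWith, BalanceEvent, and balanceCheckpointer are hypothetical names used only for illustration.

```haskell
-- | Assumed shape: a checkpointer folds a history of events into a
-- single event that summarises them.
type Checkpointer e = [e] -> e

-- | Rotation, reduced to its essence: the old history is replaced by
-- one checkpoint event. Generic over the event type.
rotateWith :: Checkpointer e -> [e] -> [e]
rotateWith checkpointer history = [checkpointer history]

-- | Toy event type standing in for a concrete one such as StateEvent.
data BalanceEvent = Deposited Int | Checkpoint Int
  deriving (Eq, Show)

-- | A checkpointer specific to the toy event type: sums all amounts,
-- including those carried by earlier checkpoints.
balanceCheckpointer :: Checkpointer BalanceEvent
balanceCheckpointer = Checkpoint . sum . map amount
  where
    amount (Deposited n) = n
    amount (Checkpoint n) = n
```

In this sketch only balanceCheckpointer knows the concrete event type, which is the split the review comment asks about: the generic rotation code can stay in one module while the type-specific checkpointer lives elsewhere.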

>>= primeWith inputsToOpenHead
>>= runToCompletion
rotatedHistory <- getEvents (fst rotatingEventStore)
length rotatedHistory `shouldBe` 2
Member

As this is testing hydrate I think it should move to NodeSpec

Contributor Author

@ffakenz ffakenz May 13, 2025

Certainly, although this is testing hydrate within the rotation context and that’s why I’d rather keep it here.

(y > x) ==> do
  eventStore@(eventSource, eventSink) <- createMockSourceSink
  let totalEvents = toInteger y
  let events = TrivialEvent <$> [1 .. fromInteger totalEvents]
Member

Could: just generate two [TrivialEvent] lists directly. Then you also don't need to generate positive numbers (a list cannot have a negative number of items).

Something like prop ... $ \toRotate additionalEvents -> let totalEvents = toRotate <> additionalEvents; let rotationConfig = RotateAfter (length toRotate) ...

Contributor Author

@ffakenz ffakenz May 13, 2025

I prefer using the positive numbers + delta trick, as I find it more explicit.

now <- getCurrentTime
let checkpointer = mkChechpointer initialChainState now
-- FIXME!
let logId = 0
Member

Yeah.. so where should we start the log file numbering? :)

I see two options that do not require knowing how many past log files exist:

  • EventId of first or last event (which is strictly increasing)
  • Timestamp of when the log was created

Neither requires an initial LogId when creating the EventStore; both can be determined when we decide to do a rotation.

Contributor Author

Both ideas are good; however, they share the same inconvenience: the latest logId still has to be identified on startup.
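For reference, the first option from this thread (naming a rotated log after the EventId of its last event) could look roughly like the sketch below. The Event record, the Word64 event id, and the function names are assumptions made for illustration and are not taken from the PR.

```haskell
import Data.Word (Word64)

-- Assumed for illustration: event ids are strictly increasing numbers.
type EventId = Word64

-- | Toy event carrying only an id and an opaque payload.
data Event = Event
  { eventId :: EventId
  , payload :: String
  }

-- | Id of the last event in the log being rotated, if there is one.
lastEventId :: [Event] -> Maybe EventId
lastEventId [] = Nothing
lastEventId events = Just (eventId (last events))

-- | Derive the rotated log's path from a base name and the last event id,
-- e.g. "state" plus id 2 gives "state-2". An empty log yields Nothing.
rotatedLogPath :: FilePath -> [Event] -> Maybe FilePath
rotatedLogPath base events =
  (\i -> base <> "-" <> show i) <$> lastEventId events
```

As noted in the reply above, a scheme like this removes the need for an initial LogId at creation time, but the latest file still has to be found on startup.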

@ffakenz ffakenz force-pushed the event-log-rotation branch 3 times, most recently from 40d126e to 246c011 Compare May 13, 2025 10:38
@ch1bo ch1bo removed this from the 0.22.0 milestone May 13, 2025
@ffakenz ffakenz force-pushed the event-log-rotation branch 2 times, most recently from 5a0aad9 to a798019 Compare May 13, 2025 17:01
@ffakenz ffakenz moved this from Triage 🏥 to In progress 🕐 in ☕ Hydra Team Work May 13, 2025
@ffakenz ffakenz marked this pull request as ready for review May 13, 2025 18:20
@ffakenz ffakenz force-pushed the event-log-rotation branch 2 times, most recently from eaa4416 to 6836989 Compare May 14, 2025 14:37
@ffakenz ffakenz requested a review from a team May 14, 2025 14:46
@ffakenz ffakenz moved this from In progress 🕐 to In review 👀 in ☕ Hydra Team Work May 14, 2025
@ffakenz ffakenz moved this from In review 👀 to In progress 🕐 in ☕ Hydra Team Work May 14, 2025
Contributor

@v0d1ch v0d1ch left a comment

Nice!

@ffakenz ffakenz force-pushed the event-log-rotation branch from 7a33c93 to 321fea5 Compare May 15, 2025 18:51
@ffakenz ffakenz moved this from In progress 🕐 to In review 👀 in ☕ Hydra Team Work May 16, 2025
ffakenz added 29 commits May 29, 2025 16:10
* also move it as part of the algorithm spec
* to keep consistency with rest
* generate delta values to avoid discarding
* rename starting state file to have new logId suffix, even if rotation is disabled
> this allows to switch rotation config on/off
* add missing logId suffix to state files
* prevents unbounded memory usage during rotation check at startup.
@ffakenz ffakenz force-pushed the event-log-rotation branch from 81c5c9c to e94a4bb Compare May 29, 2025 14:10
Labels
💬 feature A feature on our roadmap
Projects
Status: In review 👀
Development

Successfully merging this pull request may close these issues.

Event Log Rotation
4 participants